
Final Report Submission by

Name-SAHELI SAHA

Matriculation no:813710

Subject-Data Visualization

METHOD

1.Introduction

2.Converting the data into appropriate format

3.Cleaning of the data-Renaming the column, indexing the rows, deleting and editing columns.

4.Choosing region of interest-In our case Germany and India.

5.Performing descriptive statistics on our region of interest.

6.Comparing the regions of interest-Comparing the changes in mobility pattern (plotting different graphs).

7.Comparing our region of interest with other major cities worldwide.

8.Interpretation of result

9.Conclusion

10.References

Introduction:

The World Health Organization (WHO) characterized COVID-19 as a pandemic in March 2020. By late April 2020, reports pointed to over 3 million cases and 207,973 deaths in 213 countries and territories. The infection has not only become a public health crisis but has also affected the global economy. Significant economic impact has already occurred across the globe due to reduced productivity, loss of life, business closures, trade disruption, and decimation of the tourism industry.

Data description

Apple’s Mobility Trends Reports show how human mobility has changed in different countries and cities worldwide since January 2020 and are based on location data from Apple’s Maps service. They are designed to help mitigate the spread of COVID-19 by providing governments, research institutions, health authorities, and the general public with insights into the effects of national and regional lockdown policies on human mobility. The data covers 63 countries, 530 provinces and all major cities worldwide.

Raw data (apple_mobility)

Let us view the data by loading the CSV file into R with the read.csv() function, which returns a data frame. This can easily be checked as follows.

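#loading the csv file in R by using the read.csv function
#header = FALSE keeps the original header line as the first data row, so the columns are named V1, V2, ...
apple_mobility <- read.csv("applemobilitytrends.csv", header = FALSE, sep = ",")
#head returns the first part (by default the first six rows) of a data frame
head(apple_mobility)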

We can see that the data is not properly structured. There are no proper column names, and there are a lot of missing elements and unnecessary information.
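As an alternative to reading the header line as data, read.csv() can take the column names directly from the first line. The sketch below uses a small made-up file written to a temporary location, not the real applemobilitytrends.csv, and the name toy is only illustrative. The rest of this report keeps header = FALSE.

```r
# Toy file that mimics the layout of the mobility CSV (values are made up)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "geo_type,region,transportation_type,2020-01-13,2020-01-14",
  "country/region,Albania,driving,100.0,95.3",
  "country/region,Albania,walking,100.0,100.68"
), tmp)

# header = TRUE takes the names from the first line;
# check.names = FALSE keeps names such as "2020-01-13" unchanged
toy <- read.csv(tmp, header = TRUE, check.names = FALSE)

names(toy)
sapply(toy, class)
```

With this variant the date columns are read as numbers straight away, because the header text no longer sits in the same column as the values.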

Objective

We used the Mobility Trends Reports published by Apple to assess the impact of government policies on mobility changes across cities during the COVID-19 pandemic.

The following research question is examined:

How is the mobility pattern across different regions affected by the differing public policies and restrictions imposed across nations worldwide? And do the patterns vary significantly across global regions?

For each country and city, the data reports up to three modes of transport, i.e. driving, walking and transit. The time series starts on January 13th and, at the time of writing, continues until July 20th. The mobility measures for every country or city are indexed to 100 at the beginning of the series, so trends are relative to that baseline.
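To make the idea of a baseline index concrete, here is a minimal sketch. It assumes that the index is simply each day's value divided by the value of the first day, times 100. The request counts are made up, and Apple's exact procedure is not described in this report.

```r
# Hypothetical daily request counts for one region (made-up numbers)
requests <- c(2000, 1906, 2029, 500)

# Index every day to the first day of the series (baseline = 100)
mobility_index <- 100 * requests / requests[1]
mobility_index

# Percentage change relative to the baseline
mobility_index - 100
```

A value of 25 therefore means that mobility is 75 percent below the level of the first day.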

We can explore the data to gather some basic information. For example, we can check how many rows and columns it has, and whether the CSV file was really read in as a data frame. The following code shows that our data is stored as a data frame under the name apple_mobility, with 3626 rows and 176 columns.

#To check whether the given data is converted to data frame or not
print(is.data.frame(apple_mobility))
## [1] TRUE
#To count the number of columns
print(ncol(apple_mobility))
## [1] 176
#To count the number of rows
print(nrow(apple_mobility))
## [1] 3626

Alternatively, we can use the dim function, which returns the dimensions of the data frame (the number of rows followed by the number of columns).

dim(apple_mobility)
## [1] 3626  176

For a summary of our data we use the str function, a diagnostic function that compactly displays the internal structure of an R object and is an alternative to summary.

str(apple_mobility)
## 'data.frame':    3626 obs. of  176 variables:
##  $ V1  : chr  "geo_type" "country/region" "country/region" "country/region" ...
##  $ V2  : chr  "region" "Albania" "Albania" "Argentina" ...
##  $ V3  : chr  "transportation_type" "driving" "walking" "driving" ...
##  $ V4  : chr  "alternative_name" "" "" "" ...
##  $ V5  : chr  "sub-region" "" "" "" ...
##  $ V6  : chr  "country" "" "" "" ...
##  $ V7  : chr  "2020-01-13" "100.0" "100.0" "100.0" ...
##  $ V8  : chr  "2020-01-14" "95.3" "100.68" "97.07" ...
##  $ V9  : chr  "2020-01-15" "101.43" "98.93" "102.45" ...
##  $ V10 : chr  "2020-01-16" "97.2" "98.46" "111.21" ...
##  $ V11 : chr  "2020-01-17" "103.55" "100.85" "118.45" ...
##  $ V12 : chr  "2020-01-18" "112.67" "100.13" "124.01" ...
##  $ V13 : chr  "2020-01-19" "104.83" "82.13" "95.44" ...
##  $ V14 : chr  "2020-01-20" "94.39" "95.65" "95.13" ...
##  $ V15 : chr  "2020-01-21" "94.07" "97.78" "95.42" ...
##  $ V16 : chr  "2020-01-22" "93.51" "95.39" "97.66" ...
##  $ V17 : chr  "2020-01-23" "92.94" "94.24" "99.42" ...
##  $ V18 : chr  "2020-01-24" "102.13" "93.73" "113.34" ...
##  $ V19 : chr  "2020-01-25" "102.38" "97.06" "118.23" ...
##  $ V20 : chr  "2020-01-26" "101.41" "77.27" "91.31" ...
##  $ V21 : chr  "2020-01-27" "94.62" "83.37" "93.37" ...
##  $ V22 : chr  "2020-01-28" "89.12" "82.73" "91.12" ...
##  $ V23 : chr  "2020-01-29" "90.17" "84.39" "92.35" ...
##  $ V24 : chr  "2020-01-30" "90.21" "88.19" "96.74" ...
##  $ V25 : chr  "2020-01-31" "97.71" "90.79" "111.24" ...
##  $ V26 : chr  "2020-02-01" "102.5" "88.7" "123.96" ...
##  $ V27 : chr  "2020-02-02" "108.92" "79.32" "89.01" ...
##  $ V28 : chr  "2020-02-03" "92.82" "87.12" "91.66" ...
##  $ V29 : chr  "2020-02-04" "91.48" "88.06" "89.18" ...
##  $ V30 : chr  "2020-02-05" "93.99" "99.4" "94.49" ...
##  $ V31 : chr  "2020-02-06" "96.72" "85.84" "95.98" ...
##  $ V32 : chr  "2020-02-07" "102.46" "94.63" "111.12" ...
##  $ V33 : chr  "2020-02-08" "103.29" "99.74" "121.53" ...
##  $ V34 : chr  "2020-02-09" "107.83" "81.41" "89.23" ...
##  $ V35 : chr  "2020-02-10" "87.99" "90.19" "96.42" ...
##  $ V36 : chr  "2020-02-11" "94.18" "90.45" "96.97" ...
##  $ V37 : chr  "2020-02-12" "94.62" "94.16" "101.68" ...
##  $ V38 : chr  "2020-02-13" "99.7" "95.69" "104.9" ...
##  $ V39 : chr  "2020-02-14" "139.3" "109.21" "122.91" ...
##  $ V40 : chr  "2020-02-15" "123.9" "108.4" "127.62" ...
##  $ V41 : chr  "2020-02-16" "129.41" "84.52" "88.2" ...
##  $ V42 : chr  "2020-02-17" "102.24" "96.63" "92.28" ...
##  $ V43 : chr  "2020-02-18" "88.09" "87.38" "98.15" ...
##  $ V44 : chr  "2020-02-19" "88.17" "81.49" "98.96" ...
##  $ V45 : chr  "2020-02-20" "94.48" "87.21" "104.71" ...
##  $ V46 : chr  "2020-02-21" "100.62" "93.52" "132.57" ...
##  $ V47 : chr  "2020-02-22" "111.02" "94.5" "141.76" ...
##  $ V48 : chr  "2020-02-23" "113.5" "74.8" "108.21" ...
##  $ V49 : chr  "2020-02-24" "86.95" "87.25" "107.92" ...
##  $ V50 : chr  "2020-02-25" "83.01" "81.41" "99.51" ...
##  $ V51 : chr  "2020-02-26" "85.44" "77.78" "96.38" ...
##  $ V52 : chr  "2020-02-27" "82.73" "74.5" "97.96" ...
##  $ V53 : chr  "2020-02-28" "89.7" "81.19" "116.77" ...
##  $ V54 : chr  "2020-02-29" "95.48" "87.97" "117.72" ...
##  $ V55 : chr  "2020-03-01" "100.43" "78.42" "81.13" ...
##  $ V56 : chr  "2020-03-02" "89.25" "94.88" "86.76" ...
##  $ V57 : chr  "2020-03-03" "91.02" "95.65" "88.9" ...
##  $ V58 : chr  "2020-03-04" "89.72" "91.13" "92.74" ...
##  $ V59 : chr  "2020-03-05" "94.87" "94.46" "97.45" ...
##  $ V60 : chr  "2020-03-06" "102.82" "103.8" "118.32" ...
##  $ V61 : chr  "2020-03-07" "109.92" "92.92" "122.78" ...
##  $ V62 : chr  "2020-03-08" "110.07" "82.47" "80.36" ...
##  $ V63 : chr  "2020-03-09" "79.68" "87.25" "89.28" ...
##  $ V64 : chr  "2020-03-10" "68.24" "63.37" "89.29" ...
##  $ V65 : chr  "2020-03-11" "51.77" "46.7" "82.73" ...
##  $ V66 : chr  "2020-03-12" "43.79" "37.87" "86.67" ...
##  $ V67 : chr  "2020-03-13" "24.99" "31.43" "98.75" ...
##  $ V68 : chr  "2020-03-14" "24.61" "37.78" "84.77" ...
##  $ V69 : chr  "2020-03-15" "30.93" "37.44" "47.7" ...
##  $ V70 : chr  "2020-03-16" "24.69" "35.95" "53.57" ...
##  $ V71 : chr  "2020-03-17" "24.95" "30.87" "45.44" ...
##  $ V72 : chr  "2020-03-18" "24.65" "33.56" "43.21" ...
##  $ V73 : chr  "2020-03-19" "24.5" "36.84" "43.9" ...
##  $ V74 : chr  "2020-03-20" "26.31" "32.62" "16.77" ...
##  $ V75 : chr  "2020-03-21" "20.39" "27.12" "12.54" ...
##  $ V76 : chr  "2020-03-22" "19.29" "22.64" "8.74" ...
##  $ V77 : chr  "2020-03-23" "22.62" "25.93" "10.08" ...
##  $ V78 : chr  "2020-03-24" "21.61" "25.88" "10.76" ...
##  $ V79 : chr  "2020-03-25" "21.98" "25.42" "14.86" ...
##  $ V80 : chr  "2020-03-26" "23.07" "23.03" "14.58" ...
##  $ V81 : chr  "2020-03-27" "23.94" "28.27" "15.45" ...
##  $ V82 : chr  "2020-03-28" "19.49" "24.69" "13.34" ...
##  $ V83 : chr  "2020-03-29" "21.78" "20.26" "9.92" ...
##  $ V84 : chr  "2020-03-30" "23.66" "22.9" "14.59" ...
##  $ V85 : chr  "2020-03-31" "25.11" "25.93" "12.65" ...
##  $ V86 : chr  "2020-04-01" "25.02" "23.5" "15.79" ...
##  $ V87 : chr  "2020-04-02" "25.2" "26.65" "16.48" ...
##  $ V88 : chr  "2020-04-03" "24.11" "27.76" "17.15" ...
##  $ V89 : chr  "2020-04-04" "20.54" "24.9" "15.13" ...
##  $ V90 : chr  "2020-04-05" "22.54" "23.92" "11.53" ...
##  $ V91 : chr  "2020-04-06" "26.4" "31.13" "17.24" ...
##  $ V92 : chr  "2020-04-07" "26.03" "25.8" "18.55" ...
##  $ V93 : chr  "2020-04-08" "26.43" "29.81" "19.95" ...
##  $ V94 : chr  "2020-04-09" "26.7" "28.96" "19.8" ...
##  $ V95 : chr  "2020-04-10" "26.32" "27.29" "16.8" ...
##  $ V96 : chr  "2020-04-11" "25.47" "27.63" "19.4" ...
##  $ V97 : chr  "2020-04-12" "24.89" "29.59" "12.89" ...
##  $ V98 : chr  "2020-04-13" "32.64" "35.52" "21.1" ...
##  $ V99 : chr  "2020-04-14" "31.43" "38.08" "22.29" ...
##   [list output truncated]

Use of different functions to view the data

To display specific rows and columns of the data, we can use the head function together with row and column indices, e.g. to display columns 1 to 3 and column 10 of the first rows of the data.

head(apple_mobility[, c(1:3, 10)])

Now we have seen the data and found that it lists a large number of places along with the modes of transportation and how mobility changes from January onward, but we do not yet know how many cities or countries are listed in our data frame. So let us use the table function to count the rows for each geo_type. We can see that the data contains 786 city rows, 2090 county rows, 596 sub-region rows and 153 country/region rows. Note that these are row counts: a place appears once for every transportation type recorded for it.

We use the != operator inside the subset to exclude the header entry geo_type from the counts.

df<- apple_mobility
table(df$V1[df$V1 !="geo_type"])
## 
##           city country/region         county     sub-region 
##            786            153           2090            596

Now that we have the number of cities, counties, sub-regions and countries/regions listed in our data, we can print the country names to get a better understanding of which countries are covered. Column V6 contains the country to which each city, county or sub-region belongs, but the names are repeated many times, so to avoid printing a name more than once we use the following code together with the function “unique”.

The unique command drops the repeated country names. The != operator filters out entries we do not want to print: for example, != "country" keeps everything in column V6 except the header entry “country”, and != "" drops the empty entries.

#both conditions are combined with & so that the empty entries and the header entry "country" are dropped in a single step
#(chaining two separate subsets would index the shortened vector with a full-length mask and add NA to the result)
a<-apple_mobility$V6[apple_mobility$V6 !="" & apple_mobility$V6 !="country"]
unique(a)
##  [1] "Germany"              "Australia"            "United States"       
##  [4] "Netherlands"          "Turkey"               "Belgium"             
##  [7] "Greece"               "New Zealand"          "India"               
## [10] "Thailand"             "Spain"                "Switzerland"         
## [13] "United Kingdom"       "Brazil"               "Italy"               
## [16] "France"               "Romania"              "Hungary"             
## [19] "Argentina"            "Egypt"                "Canada"              
## [22] "South Africa"         "Morocco"              "Taiwan"              
## [25] "Denmark"              "Indonesia"            "United Arab Emirates"
## [28] "Ireland"              "Japan"                "Poland"              
## [31] "Sweden"               "Austria"              "Mexico"              
## [34] "Vietnam"              "Finland"              "Malaysia"            
## [37] "Russia"               "Portugal"             "Philippines"         
## [40] "Norway"               "Czech Republic"       "Saudi Arabia"        
## [43] "Chile"                "Republic of Korea"    "Israel"              
## [46] "Slovakia"             "Luxembourg"
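When filtering with several conditions, it matters whether the conditions are chained or combined. The small sketch below uses a made-up vector x (not the real V6 column) to show how chaining two logical subsets can introduce NA values, because the second mask is longer than the vector that is already filtered.

```r
x <- c("country", "", "Germany", "", "India")

# Chained subsetting: the second mask still has length 5,
# but the vector being indexed has only 3 elements left
chained <- x[x != ""][x != "country"]
chained

# Combined condition: one mask of the same length as x
combined <- x[x != "" & x != "country"]
combined
```

The combined form returns only the two country names, while the chained form also returns NA entries for the mask positions that have no matching element.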

Each place in the data is tracked for one or more means of transportation. To count how many rows exist for each transportation type, the following code is used. We can see that only three means of transport occur: driving, transit and walking. Driving is by far the most widely covered, with 3048 rows (places), while 355 rows report walking and only 222 report transit.

df<- apple_mobility
df1<-table(df$V3[df$V3 !="transportation_type"])
df1
## 
## driving transit walking 
##    3048     222     355
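The counts are easier to compare as shares. The sketch below retypes the three counts from the output above into a named vector (the names counts, shares and slice_labels are only illustrative) and converts them to percentages with prop.table().

```r
# Counts taken from the table output above
counts <- c(driving = 3048, transit = 222, walking = 355)

# Share of each transportation type in percent, rounded to one decimal
shares <- round(100 * prop.table(counts), 1)
shares

# Labels such as "driving (84.1%)" that could be passed to pie(labels = ...)
slice_labels <- paste0(names(counts), " (", shares, "%)")
slice_labels
```

Driving accounts for roughly 84 percent of all rows, so any chart of these counts is dominated by a single slice.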

The above counts can be represented with a pie chart for a better view. We will use the parameter main to add a title to the chart.

Pie chart

pie(df1,main = "Means of transportation used by different countries")

We have created our first pie chart, but it is very simple. We can extend the chart by adding more parameters to the function. We will use the parameter col with the rainbow color palette. The length of the palette should be the same as the number of values in the chart, hence we use length(df1).

pie(df1,main = "Means of transportation used by different countries",col = rainbow(length(df1)))
legend("topright", c("driving","transit","walking"), cex = 0.8,
   fill = rainbow(length(df1)))

We can also draw a 3D pie chart. It can be drawn with the additional package plotrix. Let us install (if necessary) and load the package plotrix first.

library(plotrix)

3D Pie chart

The package plotrix has a function called pie3D() that is used here. The parameter labels adds labels to the slices of the pie chart, explode sets the amount by which the slices are pulled apart, main adds an overall title for the plot, and theta sets the viewing angle in radians.

#explode: the amount to explode the pie
#main: an overall title for the plot
#theta: the angle of viewing in radians
#col: the same palette as in the legend, so that the legend colors match the slices
pie3D(df1,labels = c("driving","transit","walking"),explode = 0.1,theta = 0.5,
   col = rainbow(length(df1)), main = "3D view of transportation means used by different countries")
legend("topright", c("driving","transit","walking"), cex = 0.8,
   fill = rainbow(length(df1)))

The following libraries are required for proceeding further. Install (if necessary) and load the packages.

library(tidyr)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

A few more packages are needed. Install and load them as well.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble  3.1.0     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## v purrr   0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

Cleaning of the data-Renaming the column, indexing the rows, deleting and editing columns

Changing column names

The columns in our data are named V1, V2, …, V176. Let us replace these with the original names, which are stored in the first row, so that our data frame is clean and easy to read.

#use the first row as column names, then drop that row from the data
colnames(apple_mobility) <- as.character(unlist(apple_mobility[1,]))
apple_mobility<- apple_mobility[-1, ]

Let us check whether the column names have been changed by using the head function.

head(apple_mobility)

Changing row index

The column names have been changed and our data is readable, but the rows are not yet properly indexed: since we removed the first row, the row index now starts from 2. The following code re-indexes the rows from 1.

rownames(apple_mobility) <- 1:nrow(apple_mobility)
head(apple_mobility)

New data

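Columns 4, 5 and 6 contain the alternative name, sub-region and country. Hiding these columns gives a much cleaner view of the data. Note that the result is only displayed and not assigned back to apple_mobility, because the country column is still needed later to select the cities of a country.

#display the data without columns 4, 5 and 6; apple_mobility itself is left unchanged
head(apple_mobility[-c(4,5,6)])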
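Dropping columns by position breaks silently if the column order ever changes. As a hedged alternative, the sketch below drops columns by name in base R. The data frame toy and its values are made up, and only the column names follow the report.

```r
# Toy data frame with the same metadata columns as the report (values made up)
toy <- data.frame(
  geo_type = c("city", "country/region"),
  region = c("Berlin", "Germany"),
  transportation_type = c("driving", "driving"),
  alternative_name = c("", "Deutschland"),
  `sub-region` = c("Berlin", ""),
  country = c("Germany", ""),
  `2020-01-13` = c("100", "100"),
  check.names = FALSE
)

# Drop columns by name instead of by position
drop_cols <- c("alternative_name", "sub-region")
toy_small <- toy[, !(names(toy) %in% drop_cols)]
names(toy_small)
```

Here the country column is kept on purpose, since it is the only link between a city and its country.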

Long form of the data

Before plotting, let us convert our data into long form, which makes it much easier to create graphs. For this purpose we use the “pivot_longer” function. Several modifications are needed: the unnecessary columns are removed, a few columns are renamed, and the date column is converted to the Date type. The new data frame is saved under the name apple_mobility_new.

#lets store our new data in apple_mobility_new
apple_mobility_new<-apple_mobility %>%
  pivot_longer(-c('geo_type','region','transportation_type','alternative_name','sub-region','country'),
               names_to = "date", #to put all the date column names under the new column date
               values_to = "mobility") %>% #to put all the values of the date columns under the new column mobility
  select(-c('alternative_name','sub-region')) %>% #deleting unwanted columns using select with -c
  rename('Country/region'='region') %>% #renaming the column region to Country/region
  mutate(date=ymd(date)) #using mutate and ymd to convert the year-month-day strings to Date values
head(apple_mobility_new)
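The reshaping step is easier to follow on a tiny example. The sketch below builds a made-up wide table with two regions and three date columns (the names wide and long are illustrative) and turns it into long form with pivot_longer(). It uses as.Date() from base R instead of ymd() to keep the example small.

```r
library(tidyr)

# Toy wide table: one row per region, one column per date (values made up)
wide <- data.frame(
  region = c("Germany", "India"),
  `2020-01-13` = c("100.0", "100.0"),
  `2020-01-14` = c("103.1", "101.5"),
  `2020-01-15` = c("104.5", "102.4"),
  check.names = FALSE
)

# Every column except region becomes a (date, mobility) pair
long <- pivot_longer(wide, -region, names_to = "date", values_to = "mobility")
long$date <- as.Date(long$date)

long
dim(long)
```

Two regions times three dates give six rows, which is the general pattern: the long table has one row per region, transportation type and day.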

Changing mobility format

Now we have much cleaner data, and the date column has been converted to the Date type. However, the values in the mobility column are still of character type, and we need them in numeric form to plot them. To convert them to numeric type, let us use the function transform. Note that transform also returns a plain data frame with syntactically valid column names, so the column Country/region becomes Country.region, which is the name used in the rest of the report.

##using as.numeric in the transform function, the chr type values of mobility are converted to numeric (dbl) values

apple_mobility_new <- transform(apple_mobility_new, mobility=as.numeric(mobility))
head(apple_mobility_new )
library(cowplot)
## 
## Attaching package: 'cowplot'
## The following object is masked from 'package:lubridate':
## 
##     stamp
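The later chunks filter on a column called Country.region although the column was renamed to Country/region. The sketch below, on a made-up two-row data frame named toy, illustrates where the dot comes from: transform() rebuilds the data frame and makes the column names syntactically valid while it converts the values.

```r
# Toy long-format data with a non-syntactic column name (values made up)
toy <- data.frame(
  `Country/region` = c("Germany", "India"),
  mobility = c("100.0", "95.3"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

converted <- transform(toy, mobility = as.numeric(mobility))

names(converted)
class(converted$mobility)
```

If the slash in the name should be preserved, a conversion such as toy$mobility <- as.numeric(toy$mobility) leaves the column names untouched, but the later filters would then have to use the name with the slash in backticks.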

Choosing region of interest-In our case Germany and India

Our selected regions of interest are Germany and India, which we use to analyze the changes in urban mobility due to COVID-19. Later we will compare two major cities of these two countries with other major cities in the world. Our goal is to build an approach for analyzing regions that can be applied to, and compared with, a range of other regions, and to draw a conclusion from it.

selected region 1- GERMANY

Descriptive statistics

Before proceeding to the descriptive statistics, let us create data frames for our first region of interest, Germany: one with all transportation types and one for each transportation type. We will use the filter function for that.

German_mobility<-apple_mobility_new %>% filter(`Country.region`== "Germany")
dri<-apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "driving")
walk<-apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "walking")
trans<-apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "transit")

Using the summary function we can calculate the descriptive statistics of each means of transportation.

For the driving mode of transportation the minimum value is 37.90, which is the value of 21 March 2020, and the maximum is 145.55, which is the mobility of 26 June 2020. This can be clearly seen in the line plots later on.

For the walking mode of transportation the minimum value is 33.95, which is the value of 29 March 2020, and the maximum is 161.49, which is the mobility of 16 February 2020.

For the transit mode of transportation the minimum value is 29.10, which is the value of 21 March 2020, and the maximum is 157.64, which is the mobility of 15 February 2020.

The mean, median, and 1st and 3rd quartile values are also given. The two missing values (NA's) in each summary are days for which the data contains no value. All these values are shown graphically in the boxplot below.

summary(dri$mobility)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   37.90   69.02  102.55   93.55  112.87  145.55       2
summary(walk$mobility)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   33.95   69.92   96.28   92.98  110.77  161.49       2
summary(trans$mobility)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   29.10   46.98   89.52   81.70  106.00  157.64       2
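summary() reports the minimum and maximum but not the days on which they occur. A minimal sketch of how such dates can be looked up with which.min() and which.max() follows. The data frame toy and its values are made up and do not reproduce the German series.

```r
# Toy series with one missing value (dates and values are made up)
toy <- data.frame(
  date = as.Date("2020-03-19") + 0:4,
  mobility = c(52.3, 41.0, 37.9, NA, 44.6)
)

# which.min() and which.max() skip NA values and return the row position
toy$date[which.min(toy$mobility)]
toy$date[which.max(toy$mobility)]
```

Applied to dri, walk or trans, the same two lines would return the dates that belong to the Min. and Max. entries of the summaries.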

BOXPLOT for Germany

A boxplot graphically represents the distribution of a quantitative variable by displaying the five-number summary (minimum, first quartile, median, third quartile and maximum) and any observation classified as a suspected outlier using the interquartile range (IQR) criterion. The minimum and maximum in the boxplot are computed without these suspected outliers.

Boxplots are even more informative when presented side by side for comparing and contrasting distributions from two or more groups. Here, we compare the mobility values across the three transportation types.

boxplot(German_mobility$mobility ~ German_mobility$transportation_type,col = c("sienna","palevioletred1","royalblue2"),main="Mobility in Germany by transportation type")
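The IQR criterion behind the boxplot can be inspected directly with boxplot.stats(), the base R function that computes the numbers a boxplot draws. The sketch below uses a made-up vector x with one extreme value.

```r
# Toy sample with one extreme value (made-up numbers)
x <- c(1, 2, 3, 4, 5, 100)

stats <- boxplot.stats(x)  # default coef = 1.5

stats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out    # observations flagged as suspected outliers
```

The hinges are 2 and 5, so every value more than 1.5 times their distance (4.5) above the upper hinge is flagged. The value 100 is therefore reported as an outlier, and the upper whisker ends at 5.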

library(lattice)

Dotplot

A dotplot is similar to a boxplot, except that the observations are represented as points and no summary statistics are shown on the plot.

dotplot(German_mobility$mobility ~ German_mobility$transportation_type)

Histogram

hist(dri$mobility,col="orange")

hist(walk$mobility,col="yellow")

hist(trans$mobility,col="pink")

Scatterplot

Scatterplots allow us to check whether there is a potential link between two quantitative variables. For this reason, scatterplots are often used to visualize a potential correlation between two variables. In our case, a scatterplot shows how mobility varies with respect to date.

Here are our first scatterplots of all the means of transportation used in Germany. The par(mfrow) setting is handy for creating a simple multi-panel plot; here it places the three plots side by side. We can see that the mobility of all three modes of transport decreases after 13 March and slowly rises again from April.

par(mfrow=c(1,3)) #one row with three panels
plot(dri$date,dri$mobility,col="red")
plot(walk$date,walk$mobility,col="blue")
plot(trans$date,trans$mobility,col="brown")
par(mfrow=c(1,1)) #reset to a single panel

Line plot

Line Plot 1- Driving

To start, let us view a region of interest, in our case ‘Germany’, and create a line plot of the mobility changes in Germany for only one transportation type, i.e. driving. The figure shows a very dramatic drop in mid-March. Afterwards, driving shows clear signs of bouncing back and later exceeds the baseline, reaching its maximum of 145.55 on 26 June 2020.

apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "driving")%>% #filtering the data to get only the values of driving in Germany
  ggplot(aes(x=date,y=mobility))+ #plotting the date values in x axis and mobility in y axis
  geom_line(color="red")+ #plotting the line with color red
  geom_point()+           #plotting the points as well
  labs(title = "Germany's driving changes")#label of the graph
## Warning: Removed 2 rows containing missing values (geom_point).

Line Plot 2- Walking

apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "walking")%>% #filtering the data to get only the values of walking in Germany
  ggplot(aes(x=date,y=mobility))+ #plotting the date values in x axis and mobility in y axis
  geom_line(color="blue")+ #plotting the line with color blue
  geom_point()+           #plotting the points as well
  labs(title = "Germany's walking changes")#label of the graph
## Warning: Removed 2 rows containing missing values (geom_point).

Line Plot 3- Transit

apple_mobility_new %>% filter(`Country.region`== "Germany",`transportation_type`== "transit")%>% #filtering the data to get only the values of transit in Germany
  ggplot(aes(x=date,y=mobility))+ #plotting the date values in x axis and mobility in y axis
  geom_line(color="darkgreen")+ #plotting the line with color darkgreen
  geom_point()+           #plotting the points as well
  labs(title = "Germany's transit changes")#label of the graph
## Warning: Removed 2 rows containing missing values (geom_point).

Line plot - all combined

We can now show all three means of transportation in Germany, and how their mobility changes over the whole period covered by the data, in a single graph. We get all the lines along with the points, but the graph is not very readable because the points overlap.

apple_mobility_new %>% filter(`Country.region`== "Germany")%>% #filtering the data to get only the values of transportation in Germany
  ggplot(aes(x=date,y=mobility,group=transportation_type))+ #plotting the date values in x axis and mobility in y axis
  geom_line(aes(color=transportation_type),size = 1)+
  geom_point()+ #plotting the points as well
  labs(title = "changes in mobility in Germany") #adding a title to it
## Warning: Removed 6 rows containing missing values (geom_point).

So let us get rid of the points so that the lines are properly visible. From 13 March, German states mandated school and kindergarten closures, postponed academic semesters and prohibited visits to nursing homes. Two days later, borders to several neighbouring countries were closed. By 22 March, curfews were imposed in six German states while other states prohibited physical contact. On 15 April, a first loosening of restrictions was announced, continued in early May, and eventually holiday travel was allowed in cooperation with other European countries. Therefore two vertical lines are added: one for 13 March, when restrictions were imposed, and one for 15 April, when the first loosening was announced.

apple_mobility_new %>% filter(`Country.region`== "Germany")%>%
  ggplot(aes(x=date,y=mobility,group=transportation_type))+
  geom_line(aes(color=transportation_type),size = 1)+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 1st vertical line for 13th of March when the restrictions were imposed in Germany
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 2nd vertical line for 15th of April when the first loosening of restrictions was announced in Germany
  labs(title = "Changes in mobility in Germany") #the + on the previous line attaches the title to the plot

As we have seen in our data frame apple_mobility, the series of every country and every means of transportation starts on January 13th with a mobility value of 100. Let us add a reference line to the plot with a y-intercept of 100, which is the baseline of our data. We will do this with the function geom_hline.

apple_mobility_new %>% filter(`Country.region`== "Germany")%>%
  ggplot(aes(x=date,y=mobility,group=transportation_type))+
  geom_line(aes(color=transportation_type))+
  geom_hline(aes(yintercept = 100))+ #for adding a baseline of 100, the value of January 13th, against which any rise or drop in the graph is compared
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 1st vertical line for 13th of March when the restrictions were imposed in Germany
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 2nd vertical line for 15th of April when the first loosening of restrictions was announced in Germany
  labs(title = "Changes of mobility in Germany",caption = "Figure 1. Mobility changes in Germany, Data: apple_mobility")

Three corresponding smooth lines are added to summarize the changes in mobility of each transportation type as a single trend line. They help us to see how sharply all three modes of transportation drop after 13 March, after which the curves remain fairly stable until April. After 12 April there is a significant increase in mobility, which may be due to the loosening of restrictions in Germany.

apple_mobility_new %>% filter(`Country.region`== "Germany")%>%
  ggplot(aes(x=date,y=mobility,group=transportation_type))+
  geom_line(aes(color=transportation_type))+
  stat_smooth(aes(color=transportation_type),
              method = "loess"
  )+
  geom_hline(aes(yintercept = 100))+ #for adding a baseline of 100, the value of January 13th, against which any rise or drop in the graph is compared
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 1st vertical line for 13th of March when the restrictions were imposed in Germany
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5) #for adding the 2nd vertical line for 15th of April when the first loosening of restrictions was announced in Germany
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 6 rows containing non-finite values (stat_smooth).
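To see what a loess smoother does to a noisy series, the sketch below fits one with the base R function loess() on simulated data. The series is made up (a dip followed by a recovery plus random noise) and only imitates the shape of the mobility curves. The span of 0.75 is the default of loess().

```r
set.seed(1)

# Made-up noisy series: a dip followed by a recovery
day <- 1:60
mobility <- 100 - 60 * exp(-((day - 30) / 10)^2) + rnorm(60, sd = 5)

# Local regression; span controls how much of the data each local fit uses
fit <- loess(mobility ~ day, span = 0.75)
smoothed <- predict(fit)

# The smoothed curve varies less from day to day than the raw series
sd(diff(mobility))
sd(diff(smoothed))
```

A smaller span follows the data more closely, while a larger span gives a flatter trend. With a wide span the sharp drop in mid-March is smoothed out to some extent, so the raw lines should be read alongside the trend lines.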

Area plot

We can also show the above graph as an area plot.

apple_mobility_new %>% filter(`Country.region`== "Germany")%>%
  ggplot(aes(x=date,y=mobility,group=transportation_type))+
  geom_area(aes(color=transportation_type,fill=transportation_type),
            alpha = 0.5, position = "identity")+ #identity overlays the three areas instead of stacking or dodging them
  geom_hline(aes(yintercept = 100))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)

We saw the overall changes in mobility in Germany. Now let us check how mobility changes inside Germany by plotting the mobility of all the German cities listed in our data. We can see that 20 cities are listed in our data for Germany.

apple_mobility %>% filter(geo_type=="city",country=="Germany",transportation_type=="driving")

Line plot of all the cities in Germany for DRIVING

apple_mobility_new %>% filter(geo_type=="city",country=="Germany",transportation_type=="driving")%>%
  ggplot(aes(x=date, y=mobility,group=Country.region)) + #group draws one line per city
  geom_line(aes(color=Country.region))+
  geom_hline(aes(yintercept = 100))+ #for adding a baseline of 100, the value of January 13th, against which any rise or drop in the graph is compared
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 1st vertical line for 13th of March when the restrictions were imposed in Germany
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+ #for adding the 2nd vertical line for 15th of April when the first loosening of restrictions was announced in Germany
  labs(title = "Germany's changes of driving in different cities")

Line plot of all the cities in Germany for Walking

apple_mobility_new %>% filter(geo_type=="city",country=="Germany",transportation_type=="walking")%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Germany's changes of walking in different cities")

Bargraph

apple_mobility_new %>% filter(Country.region=="Germany",transportation_type=="walking")%>%
  ggplot(aes(x = date, y = mobility)) +
  geom_bar(stat = "identity", fill = "purple")+
  scale_x_date(breaks = '30 day') +
   theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
## Warning: Removed 2 rows containing missing values (position_stack).
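The warning above means that some rows have no mobility value and were dropped from the plot. A small sketch of how such gaps could be located, using an invented toy table (`toy`, its dates and its values are assumptions, not the report's data):

```r
# Toy data with two missing days per transportation type (invented values).
toy <- data.frame(
  date = as.Date("2020-05-10") + rep(0:3, times = 2),
  transportation_type = rep(c("driving", "walking"), each = 4),
  mobility = c(80, NA, NA, 85, 70, NA, NA, 75)
)

# Number of missing values per transportation type.
na_per_type <- tapply(is.na(toy$mobility), toy$transportation_type, sum)

# Dates on which at least one value is missing.
missing_dates <- unique(toy$date[is.na(toy$mobility)])

print(na_per_type)
print(missing_dates)
```

Knowing which dates are missing helps to decide whether the gap is a data-collection issue or something to handle before plotting.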

Line plot of all the cities in Germany for transit

apple_mobility_new %>% filter(geo_type=="city",country=="Germany",transportation_type=="transit")%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  labs(title = "Germany's changes of transit in different cities")

Line plot of selected cities in Germany

apple_mobility_new %>% filter(geo_type == "city",transportation_type == "driving",
         Country.region %in% c("Berlin","Hamburg","Frankfurt", "Cologne", "Dusseldorf", "Munich","Stuttgart"))%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Germany's changes of mobility by driving in different cities")

apple_mobility_new %>% filter(geo_type == "city",transportation_type == "walking",
         Country.region %in% c("Berlin","Hamburg","Frankfurt", "Cologne", "Dusseldorf", "Munich","Stuttgart"))%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Germany's changes of mobility by walking in different cities")

apple_mobility_new %>% filter(geo_type == "city",transportation_type == "transit",
         Country.region %in% c("Berlin","Hamburg","Frankfurt", "Cologne", "Dusseldorf", "Munich","Stuttgart"))%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-13")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-04-15")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Germany's changes of mobility by transit in different cities")

Mobility in six major cities in Germany

vec_brks <- c(-50, 0, 50)
vec_labs <- vec_brks + 100
apple_mobility_new%>%
  filter(geo_type == "city",transportation_type == "driving",
         Country.region %in% c("Berlin","Frankfurt", "Hannover", "Hamburg", "Munich","Stuttgart"))%>%
  mutate(over_under= mobility < 100 ,
         mobility = mobility-100)%>%
   ggplot(mapping = aes(x= date, y = mobility, 
                       group = Country.region, color = over_under)) + 
  geom_hline(yintercept = 0, color = "brown") + 
  geom_col() + 
  scale_y_continuous(breaks = vec_brks, labels = vec_labs) + 
  scale_color_manual(values = c("darkgreen", "blue")) +
  facet_wrap(~ Country.region, ncol = 1) + 
  guides(color = "none") + 
  labs(x = "Date", y = "Relative Mobility", title = "Relative Mobility in Apple Maps Usage for Driving in Selected Cities in Germany", 
                              subtitle = "Data are indexed to 100 for each city's usage on January 13th 2020", 
       caption = "Data: Apple_mobility_Graph") + 
  theme_minimal()
## Warning: Removed 12 rows containing missing values (position_stack).
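The plot above shifts the mobility index so that the baseline of 100 sits at zero, then relabels the axis so the reader still sees the original index values. The arithmetic can be sketched on a few invented numbers (the `mobility` vector here is an assumption for illustration):

```r
# Invented index values; 100 is the baseline of 13 January 2020.
mobility <- c(131.5, 100, 62.4, 22.8)

# Shift so the baseline sits at zero; bars then point up or down from it.
centered <- mobility - 100

# Flag days below the baseline (used for colouring in the plot).
over_under <- mobility < 100

# Axis breaks are placed on the shifted scale but labelled on the original one.
vec_brks <- c(-50, 0, 50)
vec_labs <- vec_brks + 100

print(data.frame(mobility, centered, over_under))
print(vec_labs)
```

Because the index is 100 at baseline, the shifted value can be read directly as a percentage change, for example -37.6 for an index of 62.4.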

Heatmap-Germany

apple_mobility_new %>% filter(`country`== "Germany",`transportation_type`== "driving")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "OrRd")+
  labs(title = "Changes of mobility-DRIVING",caption = "Figure2.Heatmap of Germany,Data: Apple_mobility") 

apple_mobility_new %>% filter(`country`== "Germany",`transportation_type`== "walking")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "Blues")+ # "Blues" is a valid ColorBrewer palette name; "blue" is not
  labs(title = "Changes of mobility-WALKING",caption = "Figure2.Heatmap of Germany,Data: Apple_mobility")
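scale_fill_distiller() expects the name of a ColorBrewer palette such as "OrRd" or "Blues"; a plain colour name is not a palette. A quick way to check a name before plotting is sketched below. It assumes the RColorBrewer package is installed, and the helper `is_brewer_palette()` is a hypothetical name introduced only for this example:

```r
# Hypothetical helper: TRUE if `name` is a ColorBrewer palette name.
# Assumes the RColorBrewer package is installed.
is_brewer_palette <- function(name) {
  name %in% rownames(RColorBrewer::brewer.pal.info)
}

print(is_brewer_palette("OrRd"))
print(is_brewer_palette("Blues"))
print(is_brewer_palette("blue"))
```

Palette names are case-sensitive, so "Blues" and "blues" are not interchangeable.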

Selected region 2 - INDIA

Earlier we chose Germany, where there was a partial lockdown, as our first region of interest and looked at its mobility changes. Now we select a second region, India, where the government imposed a complete lockdown on about 1.38 billion people for months to keep the COVID-19 situation from getting worse. Let us see how mobility in India changed.

India_mobility<-apple_mobility_new %>% filter(`Country.region`== "India")

Scatter plot

apple_mobility_new %>% filter(`Country.region`== "India")%>%
  ggplot(aes(x = date, y = mobility, colour = transportation_type)) +
  geom_point() +
  scale_color_hue()+
  labs(title = "Scatter plot of mobility trends- INDIA",caption = "Figure2.Mobility trend of India,Data: Apple_mobility")
## Warning: Removed 4 rows containing missing values (geom_point).

DESCRIPTIVE STATISTICS- INDIA

For walking, the minimum value is 22.82, recorded on 4 April 2020, and the maximum is 131.52, recorded on 8 February 2020. This can be seen clearly in the box plot and line plot further below.

For driving, the minimum value is 16.30, recorded on 5 April 2020, and the maximum is 141.22, recorded on 15 February 2020.

india_walk<-India_mobility%>%filter(transportation_type=="walking")
india_drive<-India_mobility%>%filter(transportation_type=="driving")
summary(india_walk$mobility)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   22.82   32.49   53.03   66.40  108.00  131.52       2
summary(india_drive$mobility)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   16.30   27.55   55.95   66.16  108.12  141.22       2
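summary() reports the extreme values but not the days on which they occurred. One way to look up those dates is sketched here on an invented toy table (the data frame `toy` and its values are assumptions, not the report's data):

```r
# Invented values; one missing day is included on purpose.
toy <- data.frame(
  date = as.Date("2020-04-03") + 0:3,
  mobility = c(30.1, 22.8, NA, 25.4)
)

# which.min() and which.max() skip NA values and return the row position.
lowest <- toy[which.min(toy$mobility), ]
highest <- toy[which.max(toy$mobility), ]

print(lowest)
print(highest)
```

The same pattern applied to `india_walk` or `india_drive` would return the row, and therefore the date, behind each minimum and maximum.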

Boxplot- INDIA

boxplot(India_mobility$mobility ~ India_mobility$transportation_type,col = c("red","palevioletred1"))

Dot plot-INDIA

dotplot(India_mobility$mobility ~ India_mobility$transportation_type)

Histogram- INDIA

hist(India_mobility$mobility,col="black")

Line plot - INDIA

apple_mobility_new %>% filter(`Country.region`== "India")%>%
  ggplot(aes(x=date,y=mobility,group=transportation_type))+
  geom_line(aes(color=transportation_type))+
  geom_hline(aes(yintercept = 100))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-24")), linetype="dashed", 
                color = "blue", size=0.5)+ # nationwide lockdown ordered on 24 March 2020
  geom_vline(xintercept=as.numeric(ymd("2020-05-31")), linetype="dashed", 
                color = "blue", size=0.5)+ # lockdown extended until 31 May 2020
  labs(title = "Changes of mobility in India",caption = "Figure2.Line plot of mobility in India,Data: Apple_mobility") 

Line plot of all the cities in INDIA

On the evening of 24 March 2020, the Government of India ordered a nationwide lockdown for 21 days, limiting the movement of India's entire population of 1.38 billion as a preventive measure against the COVID-19 pandemic. The lockdown was later extended until 31 May, with conditional relaxations after 20 April for regions where the spread had been contained or was minimal. We therefore draw two vertical lines in the plots to separate the three lockdown phases in India. Let us see how mobility changed in the major cities of India.
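The three phases can also be attached to the data as a label, which makes it possible to summarise or colour by phase. This is a minimal sketch that takes the two dates named in the text (24 March and 31 May 2020) as phase boundaries; the `dates` vector is invented for illustration:

```r
# Phase boundaries taken from the dates named in the text.
lockdown_start <- as.Date("2020-03-24")
lockdown_end <- as.Date("2020-05-31")

# Invented example dates.
dates <- as.Date(c("2020-02-08", "2020-03-24", "2020-04-05",
                   "2020-05-31", "2020-06-15"))

# Label each date with the phase it falls into.
phase <- ifelse(dates < lockdown_start, "pre-lockdown",
         ifelse(dates <= lockdown_end, "lockdown", "post-lockdown"))

print(data.frame(date = dates, phase = phase))
```

Whether the boundary days themselves count as lockdown days is a choice; here both are included in the lockdown phase.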

#GRAPH1-DRIVING
apple_mobility_new %>% filter(geo_type=="city",country=="India",transportation_type=="driving")%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_hline(aes(yintercept = 100))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-22")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-05-04")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Mobility changes of driving in different cities of India")

#GRAPH2-WALKING
apple_mobility_new %>% filter(geo_type=="city",country=="India",transportation_type=="walking")%>%
  ggplot(aes(x=date, y=mobility, group=Country.region)) + 
  geom_line(aes(color=Country.region))+
  geom_hline(aes(yintercept = 100))+
  geom_vline(xintercept=as.numeric(ymd("2020-03-22")), linetype="dashed", 
                color = "blue", size=0.5)+
  geom_vline(xintercept=as.numeric(ymd("2020-05-04")), linetype="dashed", 
                color = "blue", size=0.5)+
  labs(title = "Mobility changes of walking in different cities of India")

GRAPH1-DRIVING

vec_brks <- c(-50, 0, 50)
vec_labs <- vec_brks + 100
apple_mobility_new%>%
  filter(geo_type == "city",transportation_type == "driving",
         Country.region %in% c("Bangalore","Delhi", "Chennai", "Hyderabad", "Mumbai","Pune"))%>%
  mutate(over_under= mobility < 100 ,
         mobility = mobility-100)%>%
   ggplot(mapping = aes(x= date, y = mobility, 
                       group = Country.region, color = over_under)) + 
  geom_hline(yintercept = 0, color = "black") + 
  geom_col() + 
  scale_y_continuous(breaks = vec_brks, labels = vec_labs) + 
  scale_color_manual(values = c("red", "orange")) +
  facet_wrap(~ Country.region, ncol = 1) + 
  guides(color = "none") + 
  labs(x = "Date", y = "Relative Mobility", title = "Relative Mobility in Apple Maps Usage for Driving in Cities in India", 
                              subtitle = "Data are indexed to 100 for each city's usage on January 13th 2020", 
       caption = "Data: Apple_mobility_Graph") + 
  theme_minimal()
## Warning: Removed 12 rows containing missing values (position_stack).

GRAPH2-WALKING

vec_brks <- c(-50, 0, 50)
vec_labs <- vec_brks + 100
apple_mobility_new%>%
  filter(geo_type == "city",transportation_type == "walking",
         Country.region %in% c("Bangalore","Delhi", "Chennai", "Hyderabad", "Mumbai","Pune"))%>%
  mutate(over_under= mobility < 100 ,
         mobility = mobility-100)%>%
   ggplot(mapping = aes(x= date, y = mobility, 
                       group = Country.region, color = over_under)) + 
  geom_hline(yintercept = 0, color = "black") + 
  geom_col() + 
  scale_y_continuous(breaks = vec_brks, labels = vec_labs) + 
  scale_color_manual(values = c("red", "blue")) +
  facet_wrap(~ Country.region, ncol = 1) + 
  guides(color = "none") + 
  labs(x = "Date", y = "Relative Mobility", title = "Relative Mobility in Apple Maps Usage for walking in Cities in India", 
                              subtitle = "Data are indexed to 100 for each city's usage on January 13th 2020", 
       caption = "Data: Apple_mobility_Graph for walking") + 
  theme_minimal()
## Warning: Removed 12 rows containing missing values (position_stack).

Heatmap For India

a.Transportation type-Driving

apple_mobility_new %>% filter(`country`== "India",`transportation_type`== "driving")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "RdPu") 

b.Transportation type-Walking

apple_mobility_new %>% filter(`country`== "India",`transportation_type`== "walking")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "YlOrRd")

Comparing our region of interest with other major cities worldwide

Mobility trends of 5 different cities

We can look at a few more major cities in the world and see how transportation there has been affected by COVID-19. We take two cities from our regions of interest, Berlin from Germany and Delhi from India, and compare them with three other cities: New York, Paris and London.

vec_brks <- c(-50, 0, 50)
vec_labs <- vec_brks + 100
apple_mobility_new%>%
  filter(geo_type == "city",transportation_type == "driving",
         Country.region %in% c("Berlin","New York City", "Paris", "Delhi", "London"))%>%
  mutate(over_under= mobility < 100 ,
         mobility = mobility-100)%>%
   ggplot(mapping = aes(x= date, y = mobility, 
                       group = Country.region, color = over_under)) + 
  geom_hline(yintercept = 0, color = "gray40") + 
  geom_col() + 
  scale_y_continuous(breaks = vec_brks, labels = vec_labs) + 
  scale_color_manual(values = c("firebrick", "green")) +
  facet_wrap(~ Country.region, ncol = 1) + 
  guides(color = "none") + 
  labs(x = "Date", y = "Relative Mobility", title = "Relative Mobility in Apple Maps Usage for Driving in Selected Cities", 
                              subtitle = "Data are indexed to 100 for each city's usage on January 13th 2020", 
       caption = "Figure3.Mobility trends in different cities across the world,Data: Apple_mobility_Graph") + 
  theme_minimal()
## Warning: Removed 10 rows containing missing values (position_stack).
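To compare cities in numbers as well as visually, the lowest point of each city can be extracted. The sketch below uses an invented toy table (`toy`, its dates and its values are assumptions, not the report's data) and base R only:

```r
# Invented values for two cities.
toy <- data.frame(
  Country.region = rep(c("Berlin", "Delhi"), each = 3),
  date = rep(as.Date("2020-03-20") + c(0, 10, 20), times = 2),
  mobility = c(60, 45, 55, 40, 15, 18)
)

# Split by city, keep the row with the smallest mobility, and stack the rows.
lowest_by_city <- do.call(
  rbind,
  lapply(split(toy, toy$Country.region), function(d) d[which.min(d$mobility), ])
)

print(lowest_by_city)
```

The resulting table has one row per city with the date and value of its deepest drop, which makes the size and timing of the drops directly comparable.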

Heatmap for Paris, London, New York

With the use of a heatmap we can clearly see the mobility change in these cities, i.e. Paris, London and New York.

apple_mobility_new%>%
  filter( Country.region == "Paris")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "YlOrBr")

apple_mobility_new%>%
  filter( Country.region == "London")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "YlOrBr")

apple_mobility_new%>%
  filter( Country.region == "New York City")%>%
  ggplot(aes(x=date,y=Country.region,fill = mobility))+
  geom_tile()+
  scale_fill_distiller(palette = "YlOrBr")

Results

The Apple mobility data let us view how people in different countries travelled over a period of about six months, from January to July, covering three phases: the pre-lockdown phase, the lockdown phase and the post-lockdown (or continued lockdown) phase. Countries adopted different measures to contain the virus: some used a partial lockdown, some did not impose a lockdown but introduced restrictions, and some, like India, imposed a complete lockdown. We looked at how mobility changed in several cities and compared them. Regardless of city or country, mobility decreased gradually until April after the WHO declared COVID-19 a pandemic on 11 March. As restrictions were slightly relaxed, mobility increased a little over time until July, but in some countries it was still below the baseline value of 13 January.
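A trend like this can also be summarised in numbers, for example as a monthly mean of the mobility index. The following sketch uses invented values (the data frame `toy` is an assumption, not the report's data):

```r
# Invented index values for three months.
toy <- data.frame(
  date = as.Date(c("2020-02-10", "2020-02-20", "2020-04-10",
                   "2020-04-20", "2020-06-10", "2020-06-20")),
  mobility = c(110, 120, 30, 40, 70, 80)
)

# Label each row with its year and month, then average within each month.
toy$month <- format(toy$date, "%Y-%m")
monthly_mean <- aggregate(mobility ~ month, data = toy, FUN = mean)

print(monthly_mean)
```

Comparing each monthly mean with the baseline of 100 gives a compact view of the drop and the partial recovery.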

With the help of data visualization, it becomes easier to understand trends and therefore to draw better inferences from the data, which gives organizations an edge over their rivals. In business we often have to compare the performance of two elements or two scenarios. A traditional approach would be time-consuming and tedious, because it is difficult to go through the bulky data of both situations and then analyze it. A better way to solve this problem is to put the data for both into pictorial form, which gives a better understanding of the situations.

The Internet of Things (IoT) describes the network of physical objects that are embedded with sensors, software, and other technologies for the purpose of connecting and exchanging data with other devices and systems over the Internet.

With the help of our devices, governments and health authorities can gather data about how many people are going outside, how much time they spend at home, and with whom they come into contact.

Discussion

Pros:

There are several pros and cons of data analysis. Almost every business today relies on numbers or data, and a lot of data is considered and researched before any decision is taken. It is not always easy to understand the complex data presented for the evaluation of an issue. A human brain may understand the data, but it is exhausting to deduce anything from it, and even then we may not have a complete understanding of the situation. A better solution is to represent the data in pictorial form: it is quick, clear, and easy to understand.

Data visualization takes raw data and converts it into graphs, charts, tables and other pictorial representations to present a visual image of the data. It can turn vast amounts of data into a form that is easily understood, and the time needed to understand complex data is greatly reduced when the same data is presented pictorially. Data visualization has become an inseparable part of the modern decision-making process.

Cons:

Data visualization gives an estimate, not an exact reading, of the data.

The result is influenced by human decisions: the person preparing the data may consider only the section that seems important or needs focus and exclude the rest, which might lead to biased results.

Different people may interpret and visualize the data differently.

If data visualization is considered a new sort of communication, then it has to be self-explanatory about its purpose. If the design is not proper, this can lead to confusion in communication.

The audience may not focus on the main purpose of the visualization, which may lead to miscommunication or confusion.

In conclusion, data visualization cannot be ignored in today's business world. With ever-growing volumes of data, it has proved to be a relief for modern analysts.

Conclusions

What else could be done, and what could the results be used for?

IoT companies can make use of their sensors to track where people go during the pandemic and with whom they come into contact. This would help to contain the virus and protect people.

Online business is not a new thing today, and during festive seasons the sales of online businesses regularly go up. In a pandemic, when people cannot go out, they may prefer to do things from home, including shopping. Selling over the Internet can be useful during this time because people can access it easily and have goods delivered.

The data can be used to predict upcoming hotspots, to prepare hospitals so that they have the supplies they may need, and to create trend models that could help determine when a country can be opened up again.

Citations:

Soetewey, A. (2020, December 31). Descriptive statistics in R - Towards Data Science. Medium. https://towardsdatascience.com/descriptive-statistics-in-r-8e1cad20bf3a

Veronesi, F. (2013, June 6). Box-plot with R – Tutorial. R-Bloggers. https://www.r-bloggers.com/2013/06/box-plot-with-r-tutorial/

Holtz, Y. (2013). R Color Brewer’s palettes. https://www.r-graph-gallery.com/38-rcolorbrewers-palettes.html

Reshaping Your Data with tidyr · UC Business Analytics R Programming Guide. (n.d.). Retrieved July 7, 2021, from https://uc-r.github.io/tidyr

S. (n.d.). R Basics. Smoothing! Retrieved July 7, 2021, from http://statseducation.com/Introduction-to-R/modules/graphics/smoothing/

L. (2020, April 20). R Pie Chart – Base Graph. Learn By Example. https://www.learnbyexample.org/r-pie-chart-base-graph/

Mobility Data from Apple. (n.d.). Retrieved July 7, 2021, from https://kjhealy.github.io/covdata/articles/mobility-data.html

ggplot2 line plot : Quick start guide - R software and data visualization - Easy Guides - Wiki - STHDA. (n.d.). Retrieved July 7, 2021, from http://www.sthda.com/english/wiki/ggplot2-line-plot-quick-start-guide-r-software-and-data-visualization

N. (2020, April 22). Plot with ggplot. RStudio Community. https://community.rstudio.com/t/plot-with-ggplot/62604/14

Hoffman, J. (2021, May 27). Pros and Cons of Data Visualization. WisdomPlexus. https://wisdomplexus.com/blogs/pros-cons-data-visualization/

U. (2020b, June 26). R studio: error gather function. RStudio Community. https://community.rstudio.com/t/r-studio-error-gather-function/71041

THANK YOU